Hierarchical Text Classification as Sub-hierarchy Sequence Generation
Authors
Abstract
Hierarchical text classification (HTC) is essential for various real applications. However, HTC models are challenging to develop because they often require processing a large volume of documents and labels with hierarchical taxonomy. Recent HTC models based on deep learning have attempted to incorporate hierarchy information into the model structure. Consequently, these models are challenging to implement when the model parameters increase for a large-scale hierarchy, because the model structure depends on the hierarchy size. To solve this problem, we formulate HTC as sub-hierarchy sequence generation, incorporating hierarchy information into the target label sequence instead of the model structure. Subsequently, we propose the Hierarchy DECoder (HiDEC), which decodes a text sequence into a sub-hierarchy sequence using recursive hierarchy decoding, classifying all parents at the same level into children at once. In addition, HiDEC is trained to use hierarchical path information from the root to each leaf in a sub-hierarchy composed of the labels of a target document via an attention mechanism and hierarchy-aware masking. HiDEC achieved state-of-the-art performance with significantly fewer model parameters than existing models on benchmark datasets, such as RCV1-v2, NYT, and EURLEX57K.
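To make the idea of a sub-hierarchy sequence concrete, here is a minimal sketch of how a document's target labels might be expanded into a sub-hierarchy and then flattened into a token sequence. The toy taxonomy, the function names `build_sub_hierarchy` and `linearize`, and the parenthesis-based serialization are illustrative assumptions, not the format used by HiDEC.

```python
from collections import defaultdict


def build_sub_hierarchy(parent, targets, root="ROOT"):
    """Collect the edges on every root-to-target path (hypothetical helper)."""
    children = defaultdict(set)
    for label in targets:
        node = label
        # Walk upward until the root, recording each parent -> child edge.
        while node != root:
            up = parent[node]
            children[up].add(node)
            node = up
    return {node: sorted(kids) for node, kids in children.items()}


def linearize(children, node="ROOT"):
    """Depth-first serialization; parentheses delimit each subtree (assumed format)."""
    tokens = [node]
    kids = children.get(node, [])
    if kids:
        tokens.append("(")
        for kid in kids:
            tokens.extend(linearize(children, kid))
        tokens.append(")")
    return tokens


# Toy taxonomy given as child -> parent (assumed data, not from any benchmark).
parent = {
    "Sports": "ROOT",
    "News": "ROOT",
    "Soccer": "Sports",
    "Tennis": "Sports",
    "Politics": "News",
}
targets = ["Soccer", "Politics"]

sub_hierarchy = build_sub_hierarchy(parent, targets)
sequence = linearize(sub_hierarchy)
print(" ".join(sequence))
```

Only labels on a path to one of the document's targets appear in the sequence, so its length depends on the document's labels rather than on the size of the full taxonomy.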
Similar resources
Hierarchical Text Classification and Evaluation
Hierarchical classification refers to assigning one or more suitable categories from a hierarchical category space to a document. While previous work in hierarchical classification focused on virtual category trees where documents are assigned only to the leaf categories, we propose a top-down level-based classification method that can classify documents to both leaf and internal categories. ...
On Dataless Hierarchical Text Classification
In this paper, we systematically study the problem of dataless hierarchical text classification. Unlike standard text classification schemes that rely on supervised training, dataless classification depends on understanding the labels of the sought after categories and requires no labeled data. Given a collection of text documents and a set of labels, we show that understanding the labels can b...
Hierarchical Bayes for Text Classification
Naive Bayes models have been very popular in several classification tasks. In this paper we study the application of these models to classification tasks where the data is sparse, i.e., a large number of possible outcomes do not appear in the data. Traditionally, point estimates of the model parameters, and in particular point estimates based on Laplace’s rule, have been popular for such spars...
Automated Text Classification in the DMOZ Hierarchy
The growth in the availability of on-line digital text documents has prompted considerable interest in Information Retrieval and Text Classification. Automation of the management of this wealth of textual data is becoming an increasingly important endeavor as the rate of new material continues to grow at its substantial rate. The open directory project (ODP) also known as DMOZ is an on-line ser...
Performance measurement framework for hierarchical text classification
Hierarchical text classification or simply hierarchical classification refers to assigning a document to one or more suitable categories from a hierarchical category space. In our literature survey, we have found that the existing hierarchical classification experiments used a variety of measures to evaluate performance. These performance measures often assume independence between categories an...
Journal
Journal title: Proceedings of the ... AAAI Conference on Artificial Intelligence
Year: 2023
ISSN: 2159-5399, 2374-3468
DOI: https://doi.org/10.1609/aaai.v37i11.26520